Gene Analysis Through Different Layers
======================================

This dataset contains 4,714 cells from the **Red Nucleus** region of the human midbrain, as part of the **Human Brain Cell Atlas**. For more details, refer to the `description `_. It is available for free download in **h5ad** format from the **CELLxGENE** website via this link: `Download Link `_.

Retrieve the hierarchical structure of the data
--------------------------------------------------

.. code-block:: python

    import anndata
    import numpy as np
    import requests
    from scipy.sparse import issparse

    import CellScope

    # Download the dataset; stop with an HTTP error instead of silently skipping the file.
    url = "https://datasets.cellxgene.cziscience.com/5488ff72-58ed-4f0d-913c-1b6d4d8412b1.h5ad"
    file_path = "Siletti-1.h5ad"
    response = requests.get(url, stream=True)
    response.raise_for_status()
    with open(file_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

    # Load the expression matrix and the annotated cell types.
    adata = anndata.read_h5ad(file_path)
    fea_raw = adata.X
    cell_types = adata.obs['cell_type']
    label = np.array(cell_types)

    # Normalize, then run the two manifold-fitting steps.
    fea_raw, fea_log, fea = CellScope.cs.Normalization(fea_raw)
    fea_Fitting_1, Signal_Space, Center_index = CellScope.cs.Manifold_Fitting_1(fea)
    if issparse(fea_Fitting_1):
        fea_Fitting_1 = fea_Fitting_1.toarray()
    fea_Fitting_2, fitting_index, index_after_outlier_removal = CellScope.cs.Manifold_Fitting_2(fea_Fitting_1)

    # Cluster both fitted representations.
    T_all_1 = CellScope.cs.GraphCluster(fea_Fitting_1)
    T_all_2 = CellScope.cs.GraphCluster(fea_Fitting_2)

    # Build the tree structure from the first fitting and draw it.
    (Y_initial, label_step0, Y_1, Title_1, Y_all, Title_all,
     index_1, index_all, step0, step1) = CellScope.ts.generate_tree_structured(
        fea_Fitting_1, T_all_1, step0=None, step1=8)
    CellScope.ts.tree_structure_visualization_static(
        T_all_1, step0, step1, Title_1, Title_all,
        Y_initial, Y_1, Y_all, index_1, index_all)

.. image:: _static/tree_visualization_static.png

Define the hierarchical structure to be considered
----------------------------------------------------------------------------------------------------

The hierarchy considered here runs from **Cluster 3** to **Cluster 3-1** and **Cluster 3-2**, and then from **Cluster 3-2** to **Cluster 3-2-1** and **Cluster 3-2-2**.

.. code-block:: python

    # Each consecutive pair of entries holds the cell indices of the two nodes compared at one layer.
    layer = [
        index_1[3],
        np.setdiff1d(range(T_all_1.shape[0]), index_1[3]),
        index_all[2],
        index_all[3],
        index_all[6],
        index_all[7],
    ]

Calculate the Wasserstein distance between nodes at the same layer
----------------------------------------------------------------------------------------------------

Based on the calculated Wasserstein distance, genes are categorized as **Housekeeper Gene**, **Type-Related Gene**, or **Type-Determining Gene**. Because a gene's role can differ from layer to layer, genes are further grouped into distinct **gene type conversion flows** that describe how their category changes across layers.

.. code-block:: python

    Res, label, label_str, flow_labels = CellScope.ga.Gene_Analysis(fea_log, layer)

Count the number of each gene type at each layer
--------------------------------------------------

.. code-block:: python

    gene_counts = CellScope.ga.plot_sankey(label_str, save_fig=True, save_path='sankey_diagram.png')
    gene_counts

===================== ======= ======= =======
Gene Type             Layer 1 Layer 2 Layer 3
===================== ======= ======= =======
Housekeeper Gene      56844   58480   58990
Type-Related Gene     1578    435     214
Type-Determining Gene 814     321     32
===================== ======= ======= =======

Sankey diagram of gene type changes between layers

.. image:: _static/sankey_diagram.png

We recommend viewing the Sankey diagram directly in Python. It is built with Plotly as an interactive chart, which makes it easier to explore and analyze in detail.
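To make the Wasserstein-based categorization described earlier more concrete, the following is a minimal sketch of the general idea, not the implementation behind ``CellScope.ga.Gene_Analysis``. It computes a one-dimensional Wasserstein distance per gene between two groups of cells with ``scipy.stats.wasserstein_distance`` and maps the distances to the three gene types. The helper names ``per_gene_wasserstein`` and ``categorize``, the cut-offs ``low`` and ``high``, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def per_gene_wasserstein(expr, idx_a, idx_b):
    """Per-gene 1-D Wasserstein distance between two groups of cells.

    expr is a dense (cells x genes) array; idx_a and idx_b are integer
    index arrays selecting the cells of two sibling nodes.
    (Hypothetical helper, not part of CellScope.)
    """
    a, b = expr[idx_a], expr[idx_b]
    return np.array(
        [wasserstein_distance(a[:, g], b[:, g]) for g in range(expr.shape[1])]
    )


def categorize(dist, low=0.1, high=1.0):
    """Map distances to gene types using hypothetical cut-offs."""
    labels = np.full(dist.shape, "Type-Related Gene", dtype=object)
    labels[dist < low] = "Housekeeper Gene"
    labels[dist >= high] = "Type-Determining Gene"
    return labels


# Toy data: 200 cells, 3 genes; only gene 2 differs between the two groups.
rng = np.random.default_rng(0)
expr = rng.normal(size=(200, 3))
expr[100:, 2] += 5.0
idx_a = np.arange(100)
idx_b = np.arange(100, 200)

dist = per_gene_wasserstein(expr, idx_a, idx_b)
labels = categorize(dist)
print(dist)
print(labels)
```

In this toy setting the shifted gene receives a much larger distance than the other two, so it is the only one labelled as type-determining. With real data, the pairs of index arrays stored in ``layer`` would play the role of ``idx_a`` and ``idx_b``, and a sparse matrix would need to be converted to a dense array first.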